Using Sparse Classification Outputs as Feature Observations for Noise-robust ASR
Abstract
Sparse Classification (SC) is an exemplar-based approach to Automatic Speech Recognition (ASR). By representing noisy speech as a sparse linear combination of speech and noise exemplars, SC allows speech to be separated from noise. The approach has proven robust in noisy conditions, but at the cost of degraded performance in clean conditions. In this work, rather than using the state probability estimates obtained with SC directly in Viterbi decoding, we model the probability distributions of SC with Gaussian Mixture Models (GMMs), for which purpose we introduce a novel whitening transformation. Results on the AURORA2 task show that the proposed approach is especially effective in clean speech and in the matched noise conditions of test set A. Except in the -5 dB SNR condition, we also find substantial improvements in the non-matched noise conditions of test set B.
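The core idea in the abstract, writing a noisy observation as a sparse non-negative combination of speech and noise exemplars, can be illustrated with a minimal sketch. The abstract does not specify the solver, so the multiplicative update below (Euclidean cost with an L1 penalty) is an assumption chosen for brevity, and the function names, dictionary sizes, and random data are hypothetical.

```python
import numpy as np

def sparse_activations(y, A, sparsity=0.1, n_iter=200, eps=1e-12):
    """Find non-negative x that approximately minimises
    0.5 * ||y - A x||^2 + sparsity * sum(x)
    using multiplicative updates (an assumed solver, not the paper's)."""
    x = np.full(A.shape[1], 1.0 / A.shape[1])  # positive initial weights
    Aty = A.T @ y
    AtA = A.T @ A
    for _ in range(n_iter):
        # The ratio is non-negative, so x stays non-negative throughout.
        x *= Aty / (AtA @ x + sparsity + eps)
    return x

def separate(y, A_speech, A_noise, **kwargs):
    """Split y into speech and noise parts using a joint exemplar dictionary."""
    A = np.hstack([A_speech, A_noise])  # columns are exemplars
    x = sparse_activations(y, A, **kwargs)
    n_speech = A_speech.shape[1]
    speech = A_speech @ x[:n_speech]    # part explained by speech exemplars
    noise = A_noise @ x[n_speech:]      # part explained by noise exemplars
    return x, speech, noise

# Toy data: 23-dimensional non-negative features (hypothetical sizes).
rng = np.random.default_rng(0)
A_speech = rng.random((23, 40))
A_noise = rng.random((23, 10))
y = 2.0 * A_speech[:, 3] + 0.5 * A_noise[:, 7]  # synthetic noisy observation

x, speech, noise = separate(y, A_speech, A_noise)
```

In the setting the abstract describes, the speech exemplar activations would then be turned into state probability estimates; that mapping, and the whitening transformation applied before GMM modeling, are not detailed in the abstract and are not shown here.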
Similar resources
A Novel Noise-Robust Texture Classification Method Using Joint Multiscale LBP
In this paper we describe a novel noise-robust texture classification method using joint multiscale local binary patterns. The first step in texture classification is to describe the texture by extracting different features. Several methods have been developed for this purpose; one of the most popular is the Local Binary Pattern (LBP) method and its variants, such as Completed Local Binary...
Noise robust ASR in reverberated multisource environments applying convolutive NMF and Long Short-Term Memory
This article proposes and evaluates various methods to integrate the concept of bidirectional Long Short-Term Memory (BLSTM) temporal context modeling into a system for automatic speech recognition (ASR) in noisy and reverberated environments. Building on recent advances in Long Short-Term Memory architectures for ASR, we design a novel front-end for context-sensitive Tandem feature extraction a...
Exploring Low-Dimensional Structures of Modulation Spectra for Robust Speech Recognition
The development of noise robustness techniques is vital to the success of automatic speech recognition (ASR) systems in the face of varying sources of environmental interference. Recent studies have shown that exploring low-dimensional structures of speech features can yield good robustness. In this vein, research on low-rank representation (LRR), which considers the intrinsic structures of speech...
Face Recognition using an Affine Sparse Coding approach
Sparse coding is an unsupervised method that learns a set of over-complete bases to represent data such as images and video. Sparse coding has attracted increasing interest for image classification applications in recent years. However, in cases where similar images come from different classes, such as face recognition applications, different images may be classified into the same class, and hen...
Robust Speech and Bird Song Processing using Multi-band Correlograms and Sparse Representations
Dissertation: Robust Speech and Bird Song Processing using Multi-band Correlograms and Sparse Representations, by Lee Ngee Tan, Doctor of Philosophy in Electrical Engineering, University of California, Los Angeles, 2014. Professor Abeer Alwan, Chair. This dissertation focuses on algorithms for robust speech and bird song processing. Many applications perform well under ideal signal conditions,...
Publication date: 2012